Non-homogeneous distributed storage systems 



Vo Tam Van, Chau Yuen
Singapore Univ. of Tech. and Design
Email: {tamvan_vo, yuenchau}@sutd.edu.sg

Jing (Tiffany) Li
Lehigh University
Email: jingli@lehigh.edu






Abstract — This paper describes a non-homogeneous distributed 
storage system (DSS), where there is one super node which has a larger storage size and higher reliability and availability than the other storage nodes. We propose three distributed storage schemes based on $(k + 2, k)$ maximum distance separable (MDS) codes and non-MDS codes to show the efficiency of such non-homogeneous DSS in terms of repair efficiency and data availability. Our schemes achieve the optimal bandwidth for repairing 1-node failure, but require only one fourth of the minimum required file size and can operate with a smaller field size, leading to a significant complexity reduction compared with traditional homogeneous DSS. Moreover, with non-MDS codes, our scheme can achieve an even smaller repair bandwidth of $M/2$. Finally, we show that our schemes can increase the data availability by 10% compared with the traditional homogeneous DSS scheme.

Index Terms — Exact-repair MDS codes, non-homogeneous 
DSS, minimum storage regenerating (MSR) codes. 

I. Introduction 

Distributed storage systems (DSS) are widely used today 
for storing data reliably over long periods of time using a 
distributed collection of storage nodes, which may be indi- 
vidually unreliable. Application scenarios include large data 
centers such as Total Recall [4], OceanStore [12] and peer-to- 
peer storage systems such as Wuala [9], that use nodes across 
the Internet for distributed file storage. 

One of the challenges for DSS is the repair problem: If a 
node storing a coded piece fails or leaves the system, in order 
to maintain the same level of reliability, we need to create 
a new encoded piece and store it at a new node with the 
minimum repair bandwidth. To solve this problem, Dimakis 
et al. introduced a generic framework based on regenerating 
codes (RC) in [2]. RC use ideas from network coding to define 
a new family of erasure codes that can achieve different trade- 
offs in the optimization of storage capacity and communication 
costs. The optimal tradeoff curve for achievable codes has two 
extremal points which are of particular interest: the minimum 
storage regenerating (MSR) codes with minimum possible 
storage size for a given repair capability, and the minimum 
bandwidth regenerating (MBR) codes with minimum possible 
repair bandwidth. 

Consider a minimum storage system, where a source file of size $M$ units is split into $k$ parts, defined over a finite field $\mathbb{F}_q$ and stored across $n$ nodes in the DSS. While the economy in storage is highly desirable, issues may arise when the system tries to repair failure at the optimal repair bandwidth. Specifically, if $q$ or $M$ grows to be arbitrarily large, then the system may become inefficient and impractical due to the high computational complexity or the fast growing storage consumption.

Figure 1. An example of traditional repairing 1-node failure.

Another challenge for DSS is data availability, which is of critical importance to a peer-to-peer (P2P) storage/backup system that relies on a swarm of distributed and independent nodes for file storage. The nodes not only differ in their storage capacity and traffic bandwidth, but also may not be online or available at all times. Hence, there is a pressing need to increase the data availability, such that information is available with a probability approaching 1. Clearly, P2P environments are heterogeneous by nature, and code design for such systems must explicitly account for this heterogeneity.

The primary interest of this paper is to study a non- 
homogeneous DSS, where one "super node" has a larger 
storage size and higher reliability and availability than the 
other storage nodes. We study a class of high-rate (k + 2, k) 
MDS storage codes, and show that with MDS code such non- 
homogeneous DSS can achieve the optimal bound in [2] when 
repairing single or double-node failures, but require smaller M 
and q than the traditional homogeneous model in [1]. Another
proposed scheme based on non-MDS codes is shown to repair 
1-node failure below the optimal repair bandwidth bound in 
[2]. Moreover, we show that our proposed non-homogeneous 
DSS schemes can achieve a higher data availability than the 
traditional homogeneous DSS scheme. 

This paper is organized as follows. Section II shows the 
definition of non-homogeneous DSS. Section III shows three 
proposed schemes of exact repair with (k + 2, k) storing codes 
in non-homogeneous DSS. Section IV shows the numerical 
results of our schemes and the comparison with previous 
methods. Finally, the paper is concluded in Section V. 



II. Models of distributed storage systems 

In this section, we present a brief review of the traditional 
homogeneous DSS proposed in [2]. Then, a new model of non- 
homogeneous DSS is proposed to realize the practical DSS. 

A. Model of traditional homogeneous DSS 

We follow the definition of traditional homogeneous DSS using $(n, k, d, \alpha, \gamma)$ regenerating codes over finite field $\mathbb{F}_q$. This network has $n$ storage nodes and every $k$ nodes suffice to reconstruct all the data. The size of the file to be stored is $M$ units$^1$, partitioned into $k$ equal parts $f_1, \dots, f_k \in \mathbb{F}_q^{N}$, where $N = M/k$. After encoding them into $n$ coded parts using an $(n, k)$ maximum distance separable (MDS) code, we store them at $n$ nodes.

We define here the MDS property of a storage code using the notion of data collectors as presented in [2]. A storage code where each node contains $M/k$ worth of storage has the MDS property if a data collector can reconstruct all the $M$ units by connecting to any $k$ out of $n$ storage nodes.

When a node fails, the data stored therein is recovered by downloading $\beta$ packets each from any $d$ ($\ge k$) of the remaining $(n - 1)$ nodes; the total repair bandwidth is then $\gamma = d\beta$, as shown in Fig. 1. It has been shown in [2] that there exists an optimal tradeoff between the storage per node, $\alpha$, and the bandwidth to repair one node, $\gamma$. In this paper, we focus on the extreme point where the smallest $\alpha = M/k$ corresponds to a minimum-storage regenerating (MSR) code. To minimize $\gamma_{MSR}$, let $d = n - 1$ and we get

$$(\alpha_{MSR},\ \gamma_{MSR}) = \left(\frac{M}{k},\ \frac{M}{k}\,\frac{n-1}{n-k}\right). \qquad (1)$$

In the case of high-rate codes $n = k + 2$, a lower bound for the repair bandwidth $\gamma_1$ of 1-node failure was shown in [2]:

$$\gamma_1 \ge \frac{M}{k}\,\frac{n-1}{n-k} = \frac{M}{2}\,\frac{k+1}{k}. \qquad (2)$$
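The MSR point (1) and the high-rate bound (2) are simple enough to evaluate directly. The following is a minimal sketch, not part of the original paper; the function names `msr_point` and `high_rate_bound` are hypothetical, and exact rational arithmetic is used only to avoid rounding.

```python
from fractions import Fraction


def msr_point(M, n, k):
    """Return (alpha, gamma) at the MSR point with d = n - 1 helpers, as in (1)."""
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    alpha = Fraction(M, k)                    # storage per node, M / k
    gamma = alpha * Fraction(n - 1, n - k)    # total repair bandwidth
    return alpha, gamma


def high_rate_bound(M, k):
    """Lower bound (2) on the 1-node repair bandwidth when n = k + 2."""
    return Fraction(M, 2) * Fraction(k + 1, k)


alpha, gamma = msr_point(M=6, n=5, k=3)
print(alpha, gamma)             # 2 4
print(high_rate_bound(6, 3))    # 4
```

For the $(5, 3)$ code with $M = 6$ used later in the paper, both expressions give a repair bandwidth of 4 packets.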

B. Model of the proposed non-homogeneous DSS 

Definition 1. A non-homogeneous DSS with the parameter $(n, k, h)$ is a distributed storage system with $h$ nodes based on $(n, k)$ storing codes, in which the amount of data stored at and downloaded from each node is variable. Node $i$ in the network stores $\alpha_i \ge M/k$ units. When node $i$ fails, it is repaired by downloading $\beta_j$ packets from node $j$, $j \in \{n\} \setminus i$.
□

It is clear that we must have $\beta_j \le \alpha_j$ for all $j \ne i$, since a node cannot transmit more information than it is storing. When $n = h$, $\alpha_i = \alpha$, $\beta_j = \beta$ for all $i$, $j \ne i$, we obtain the traditional homogeneous DSS. When $n > h$, there are more redundant blocks than storage nodes, and the storage process has to decide which node(s) store more blocks.

Example 2. In this paper, we present the idea of non-homogeneous DSS using the following setting: there is one big node, called the super node, which has a larger storage capacity and higher reliability and availability than the other nodes. Such a scenario is possible in practical systems; e.g., in a peer-to-peer backup system, the super node could be the service provider, which has higher availability and provides higher storage capacity than other peers.

$^1$ We use "packets", "units", "blocks" interchangeably.

Consider a non-homogeneous DSS with one super node and three other storage nodes based on a $(5, 3)$ MDS code, which can be denoted as $(n = 5, k = 3, h = 4)$. Assume a file of size $M = 6$; this file is divided into $k = 3$ parts, each part containing $N = M/k = 2$ packets. After encoding them into 5 encoded parts, or 10 packets, we store the first $2N = 4$ packets in the super node, and each of the remaining three nodes stores $N = 2$ packets, as shown in Fig. 2.

Figure 2. An example of non-homogeneous DSS based on (5, 3) MDS codes and 4 storage nodes. (Node 1 is the super node.)
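The allocation in this example can be sketched in a few lines. This is an illustrative helper only: the name `allocate` is hypothetical, and it assumes the setting of Example 2, i.e. $h = n - 1$ nodes with node 1 acting as the super node that holds two coded parts.

```python
def allocate(n, k, M):
    """Packets stored per node when a super node holds two coded parts.

    Assumes h = n - 1 nodes: node 1 (the super node) stores 2N packets and
    every other node stores N = M / k packets.
    """
    if M % k != 0:
        raise ValueError("M must be divisible by k")
    N = M // k
    h = n - 1
    return [2 * N] + [N] * (h - 1)


packets = allocate(n=5, k=3, M=6)
print(packets)         # [4, 2, 2, 2]
print(sum(packets))    # 10 coded packets in total, i.e. n * N
```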

III. Exact repair of (k + 2, k) storing codes in non-homogeneous DSS

In this paper, we limit our study to high-rate $(n = k + 2, k)$ exact-repair storing codes. The homogeneous version of this problem has been considered in [1], [5], [14]. We propose three efficient DSS schemes using MDS and non-MDS storage codes in such $(n = k + 2, k, h = k + 1)$ non-homogeneous DSS, which are denoted as Scheme A, B, and C in Table I. Schemes A and C use MDS codes while Scheme B uses non-MDS codes. The new system consists of $(k + 1)$ nodes, which include $k$ nodes of storage size $N$ and one super node of size $2N$.

A. Scheme A: Store two systematic data at the super node 
with MDS codes. 

As can be seen in Table I, the first $(k - 1)$ storage nodes of Scheme A store the systematic file parts $f_1, \dots, f_k$, where $f_1, f_2$ are stored in the same systematic node $s_1$ and the other file parts $f_3, \dots, f_k$ are stored individually in the remaining $(k - 2)$ systematic nodes $s_2, \dots, s_{k-1}$, respectively. The first and second parity nodes $p_1, p_2$ store linear combinations of all $k$ systematic parts, $f_1 A_1 + \cdots + f_k A_k$ and $f_1 B_1 + \cdots + f_k B_k$. Here, $A_i$ and $B_i$ denote $N \times N$ matrices of coding coefficients defined over finite field $\mathbb{F}_q$.

• Repair systematic failure nodes for Scheme A:

We first consider repairing 1-node failure (the case where super node $s_1$ fails is considered as a 2-node failure, which will be discussed in more detail later). Without loss of generality, we assume that node $s_2$, which contains $f_3$, has failed. For simplicity, we first consider the case $(n = 5, k = 3)$. To recover the desired data $f_3$, we have to download the following equations from the two surviving parity nodes:

$$\begin{aligned} & f_1 A_1 V^1 + f_2 A_2 V^1 + f_3 A_3 V^1 \\ & f_1 B_1 V^2 + f_2 B_2 V^2 + f_3 B_3 V^2 \end{aligned} \qquad (3)$$

where $A_i, B_i \in \mathbb{F}_q^{N \times N}$ for all $1 \le i \le k$ and $V^1, V^2 \in \mathbb{F}_q^{N \times N/2}$ can be derived based on the failure node. To repair different failure nodes, different $V^1, V^2$ are needed, which can be precalculated. It can be seen from Fig. 3 that the terms $(f_1 A_1 V^1 + f_2 A_2 V^1)$ and $(f_1 B_1 V^2 + f_2 B_2 V^2)$ are removable by downloading $(\frac{N}{2} + \frac{N}{2})$ packets from the super node. Therefore, the desired data $f_3$ can be recovered if the following rank constraint is satisfied:

$$\operatorname{rank}\,[A_3 V^1,\ B_3 V^2] = N \qquad (4)$$

Table I. Three schemes of the $(k + 2, k, k + 1)$ non-homogeneous model vs. the traditional model based on $(k + 2, k)$ MDS codes, where S and P are the abbreviations of systematic and parity, respectively. Here, $f_i \in \mathbb{F}_q^{1 \times N}$ and $A_i, B_i \in \mathbb{F}_q^{N \times N}$ for all $1 \le i \le k$. Note that all Schemes A, B and C use only $(k + 1)$ storage nodes to store $(k + 2)$ packets.
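A rank condition such as (4) can be checked numerically by Gaussian elimination over the field. The sketch below is only an illustration under stated assumptions: it works over a prime field GF(p), and the matrices `A3`, `B3`, `V1`, `V2` are hypothetical toy values with $N = 2$ over GF(5), not the coefficients of the paper's code. The helper names are likewise hypothetical.

```python
def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over the prime field GF(p)."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a pivot row at or below the current rank.
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)           # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def matmul_mod_p(a, b, p):
    """Matrix product a @ b over GF(p)."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def hconcat(a, b):
    """Place two matrices with equal row counts side by side."""
    return [ra + rb for ra, rb in zip(a, b)]


# Hypothetical toy values with N = 2 over GF(5); not the paper's code.
p = 5
A3 = [[1, 0], [0, 1]]
B3 = [[1, 0], [0, 2]]
V1 = [[1], [1]]    # N x N/2 projection
V2 = [[1], [1]]

stacked = hconcat(matmul_mod_p(A3, V1, p), matmul_mod_p(B3, V2, p))
print(rank_mod_p(stacked, p))    # 2, so the rank condition (4) holds with N = 2
```

The same helper could be applied to alignment-type conditions, where the target rank is $N/2$ rather than $N$.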



To recover the desired data $f_3$ in the general $(n = k + 2, k)$ case, we have to use the following equations:

$$\begin{aligned} & f_1 A_1 V^1 + f_2 A_2 V^1 + f_3 A_3 V^1 + \cdots + f_k A_k V^1 \\ & f_1 B_1 V^2 + f_2 B_2 V^2 + f_3 B_3 V^2 + \cdots + f_k B_k V^2 \end{aligned} \qquad (5)$$

Similarly, the terms $(f_1 A_1 V^1 + f_2 A_2 V^1)$ and $(f_1 B_1 V^2 + f_2 B_2 V^2)$ are removable by downloading $(\frac{N}{2} + \frac{N}{2})$ packets from super node 1. The following conditions must be satisfied to achieve the optimal repair bandwidth:

To relax the complexity of the constraints found in (6), we set $A_i = I_N$ and $V^1 = V^2$, and then obtain the following equations:

$$\begin{aligned} & \vdots \\ & \operatorname{rank}\,[B_k V^1,\ V^1] = \frac{N}{2} \end{aligned} \qquad (7)$$

Figure 3. An example of repairing 1-node failure using Scheme A based on (5, 3) MDS codes in non-homogeneous DSS



The problem of finding the matrices $B_i$ is similar to [1]. However, here we only need to solve $(k - 2)$ equations. Therefore, the fragment size and finite field will be smaller, with $M = 2^{k-1} k$ (which means the fragment size reduces to $\frac{1}{4}$ of that of the traditional homogeneous model when $k \ge 3$), and $q = 2k - 1$. These advantages allow us to reduce the minimum unit size of the stored file and to reduce the computational complexity by working over a smaller finite field. It can be seen that in this case of 1-node failure, the proposed Scheme A can achieve the optimal repair bandwidth of $\frac{k+1}{2k} M$, which is the same as the traditional homogeneous DSS scheme.

• Repair first parity node for Scheme A: 

If the first parity node $p_1$ fails, we make a change of variables to obtain a new representation for our code such that the first parity $p_1$ becomes a systematic node in the new representation. We make the change of variables as follows:

$$\sum_{i=1}^{k} f_i = y_3, \qquad f_s = y_s \ \text{ for } 1 \le s \ne 3 \le k \qquad (8)$$

We solve (8) by expressing $f_3$ in terms of the $y_i$ variables and obtain

$$f_3 = y_3 - (y_1 + y_2 + y_4 + \cdots + y_k)$$

The problem of repairing the first parity is equivalent to repairing systematic node $y_3$ in the new representation. Note that $y_1, y_2$ are stored in the same node since they correspond to $f_1, f_2$. To repair $y_3$, we have to download the following equations from nodes $s_2$ and $p_2$:

$$\begin{aligned} & (-y_1) + (-y_2) + y_3 + \cdots + (-y_k) \\ & y_1 (B_1 - B_3) + y_2 (B_2 - B_3) + y_3 B_3 + \cdots + y_k (B_k - B_3) \end{aligned}$$

Again, the $V^1, V^2$ matrices need to satisfy the following conditions in order to achieve the optimal repair bandwidth.

$$\begin{aligned} & \operatorname{rank}\,[(B_4 - B_3) V^1,\ V^1] = \frac{N}{2} \\ & \qquad \vdots \\ & \operatorname{rank}\,[(B_k - B_3) V^1,\ V^1] = \frac{N}{2} \end{aligned} \qquad (9)$$

As in the systematic case, finding the matrices $B_i$ is similar to [1]. However, we need to solve only $(k - 2)$ equations, which means the fragment size and the finite field will be smaller: $M = 2^{k-1} k$ and $q = 2k - 1$.

• Repair second parity node for Scheme A:

Similar to the above, we rewrite this code in a form where the second parity is a systematic node in some representation

where f is a full-rank row transformation of f. We proceed in a way similar to how we handled the first parity repair to achieve the optimal repair bandwidth. A set of equations similar to that for repairing the first parity node can be obtained as shown below.

It can be seen that in this case the fragment size and the finite field will again be $M = 2^{k-1} k$ and $q = 2k - 1$, which are smaller than those in [1], while still achieving the optimal repair bandwidth.

Figure 4. Illustration of exact repair of 2-node failure for a (5, 3) exact-repair MDS code for Scheme A in non-homogeneous DSS. The total repair bandwidth $\gamma_2 = M + \frac{M}{k}$ achieves the lower bandwidth bound.



• Repair 2-node failures for Scheme A:

To repair a 2-node failure at the optimal repair bandwidth, one solution is shown in Fig. 4. Let us assume that $s_2$ and $p_1$ fail. To repair them, we first download $k$ coded parts ($M$ packets) from the surviving nodes; the original file can then be recovered due to the property of MDS codes. Therefore, we can obtain the data of nodes $s_2$ and $p_1$, and store them in a new node, say the new $p_1$. Next, the data of the failed node $s_2$ (i.e., $f_3$ in this case) is forwarded to the new node $s_2$. The total repair bandwidth will be $\gamma_2 = M + \frac{M}{k}$. It is trivial to repair the super node $s_1$ at the repair bandwidth of $M$ by downloading data from the surviving nodes. It should be noted that the failure of the super node plus one additional node cannot be repaired, since it can be regarded as a three-node failure and is therefore beyond the correcting ability of $(k + 2, k)$ MDS codes.

B. Scheme B: Store two systematic data at the super node 
with non-MDS codes. 

Scheme B uses the same model as Scheme A. However, we can achieve a repair bandwidth for 1-node failure below the optimal bound in this non-homogeneous model if the terms $(f_1 A_1 V^1 + f_2 A_2 V^1)$ and $(f_1 B_1 V^2 + f_2 B_2 V^2)$ in (5) are the same, or the following constraints are satisfied: $A_1 V^1 = \lambda B_1 V^2$, $A_2 V^1 = \lambda B_2 V^2$. This means that we only need to download $\frac{N}{2}$ packets instead of $(\frac{N}{2} + \frac{N}{2})$ packets from the super node to eliminate these terms. The following example is used to present the idea of repairing 1-node failure below the optimal bandwidth bound for the case $k = 3$, $n = 5$. Suppose $f_1 = [a_1, a_2]^T$, $f_2 = [b_1, b_2]^T$, $f_3 = [c_1, c_2]^T$ and $p_1 = f_1 A_1 + f_2 A_2 + f_3 A_3$, $p_2 = f_1 B_1 + f_2 B_2 + f_3 B_3$ are the systematic and parity data of a $(5, 3)$ storage code over finite field $\mathbb{F}_3$, where

$$A_i = \cdots, \qquad B_i = \cdots \qquad (12)$$

It can be seen that any 1-node failure (systematic or parity node) except the super node can be repaired with a bandwidth of $\frac{M}{2}$, which is below the optimal bound $\left(\frac{k+1}{2k} M\right)$. Fig. 5 shows the process of using two projection vectors $V^1$ and $V^2$ for repairing one systematic node failure below the optimal bandwidth bound. The extension to the general case $(k + 2, k)$ is straightforward. In the case of systematic 1-node failure, the general design constraints for Scheme B are:

$$\begin{aligned} & \vdots \\ & \operatorname{rank}\,[A_k V^1,\ B_k V^2] = \frac{N}{2} \end{aligned} \qquad (13)$$

Figure 5. Total repair bandwidth of 1-node failure $\gamma_1 = \frac{M}{2}$ is smaller than the bound. In this example, $\gamma_1 = 3 < 4$, the repair bandwidth in the traditional case.

Figure 6. An example of repairing 1-node failure using Scheme C based on (5, 3) MDS codes in non-homogeneous DSS



Similar to Scheme A, we can find the solution for Scheme B. However, in Scheme B the MDS property of the storage code breaks, since we cannot reconstruct the original information from the surviving nodes in the case of 2-node failure.
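The bandwidth figures quoted for Schemes A and B follow from a simple packet count, sketched below. The accounting is an assumption consistent with the $(5, 3)$ example rather than a formula stated in the paper: when one ordinary node fails, the $k - 1$ surviving ordinary nodes each send $N/2$ packets, while the super node sends $N$ packets in Scheme A and $N/2$ packets in Scheme B. The function name is hypothetical.

```python
from fractions import Fraction


def one_node_repair_bandwidth(M, k, scheme):
    """Packets downloaded to repair one ordinary (non-super) node.

    Assumed accounting: the k - 1 surviving ordinary nodes send N/2 packets
    each; the super node sends N packets (Scheme A) or N/2 (Scheme B).
    """
    if scheme not in ("A", "B"):
        raise ValueError("scheme must be 'A' or 'B'")
    N = Fraction(M, k)
    from_super = N if scheme == "A" else N / 2
    return from_super + (k - 1) * N / 2


print(one_node_repair_bandwidth(6, 3, "A"))    # 4, equal to (k + 1) M / (2k)
print(one_node_repair_bandwidth(6, 3, "B"))    # 3, equal to M / 2
```

Under this count the gap between the two schemes is $N/2 = M/(2k)$ packets, which is the saving attributed to Scheme B.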

C. Scheme C: Store two parity data at the super node with 
MDS codes. 

We first consider an example with $n = 5$, $k = 3$ for simplicity. Without loss of generality, assume that node 1 with data $f_1$ has failed and the two parity packets $p_1, p_2$ are stored at the super node. To recover $f_1$, we have the following equations after eliminating $f_2$ and $f_3$ from the parity node:

$$\begin{bmatrix} f_1 A_1 V^1 + f_2 A_2 V^1 + f_3 A_3 V^1 \\ f_1 B_1 V^2 + f_2 B_2 V^2 + f_3 B_3 V^2 \end{bmatrix} \longrightarrow \begin{bmatrix} f_1 C_1 V^1 + f_2 C_2 V^1 \\ f_1 D_1 V^2 + f_3 D_2 V^2 \end{bmatrix} \qquad (14)$$

where $C_i, D_i \in \mathbb{F}_q^{N \times N}$ for $i = 1, 2$ and $C_1 = A_1 A_3^{-1} - B_1 B_3^{-1}$, $C_2 = A_2 A_3^{-1} - B_2 B_3^{-1}$, $D_1 = A_1 A_2^{-1} - B_1 B_2^{-1}$, $D_2 = A_3 A_2^{-1} - B_3 B_2^{-1}$. It can be seen from Fig. 6 that the terms $f_2 C_2 V^1$ and $f_3 D_2 V^2$ are removable by downloading $(\frac{N}{2} + \frac{N}{2})$ packets from the parity node. Therefore, the desired data $f_1$ can be recovered if the following rank constraint is satisfied:

$$\operatorname{rank}\,[C_1 V^1,\ D_1 V^2] = N \qquad (15)$$

For the general $(k + 2, k)$ case, similar to Scheme A, we set $A_i = I_N$ for all $i$. To recover the desired data $f_1$, we have the following reduced equations from the parity node:

$$\begin{aligned} & f_1 (B_1 - B_3) + f_2 (B_2 - B_3) + \sum_{i=4}^{k} f_i (B_i - B_3) \\ & f_1 (B_1 - B_2) + f_3 (B_3 - B_2) + \sum_{i=4}^{k} f_i (B_i - B_2) \end{aligned} \qquad (16)$$

The following conditions must be satisfied to achieve the optimal repair bandwidth of $\frac{k+1}{2k} M$:

$$\begin{aligned} & \vdots \\ & \operatorname{rank}\,[(B_k - B_2) V^1,\ (B_k - B_3) V^2] = \frac{N}{2} \end{aligned} \qquad (17)$$



In general, solving (17) is still an open problem. Here, we give a numerical solution for the case $n = 6$, $k = 4$. Consider $f_1 = [a_1, a_2]^T$, $f_2 = [b_1, b_2]^T$, $f_3 = [c_1, c_2]^T$, $f_4 = [d_1, d_2]^T$ and $p_1 = f_1 + f_2 + f_3 + f_4$, $p_2 = f_1 B_1 + f_2 B_2 + f_3 B_3 + f_4 B_4$, which are the systematic and parity data of a $(6, 4)$ storage code over finite field $\mathbb{F}_3$, where

$$B_i = \cdots \qquad (18)$$

It can be seen that any 1-node failure except the super node can be repaired with the optimal bandwidth $\left(\frac{k+1}{2k} M\right)$. Fig. 7 shows the process of using two projection vectors $V^1$ and $V^2$ for repairing the first systematic node.

For the cases of super node failure and 2-node failure, the repair process is similar to Scheme A. The total repair bandwidth will be $M$ and $M + \frac{M}{k}$, respectively.

Figure 7. An example of repairing 1-node failure using Scheme C based on (6, 4) MDS codes in non-homogeneous DSS



IV. Performance Analysis 

Compared with the previous work in [1], our Schemes A and C can achieve the optimal repair bandwidth for 1-node failure with a smaller finite field size $q$ and a 75% smaller data size $M$ than [1]. Moreover, Scheme B, which uses a non-MDS code, can repair 1-node failure with a bandwidth that is $\frac{M}{2k}$ smaller than the optimal bound. A summary is presented in Table II for the various schemes.

A. Numerical Case Study 

As an illustration, we continue using the example $n = 5$, $k = 3$. Assume we have a data file of size 48 blocks to store across the DSS.

Scheme A and C: Divide the file into 4 fragments of size $M_1 = 12$ blocks. These fragments are stored across $k + 1 = 4$ nodes in the non-homogeneous DSS. In the case of 1-node failure, the repair bandwidth will be $4 \times \frac{k+1}{2k} M_1 = 32$ blocks. In the case of 2-node failure, the repair bandwidth will be $4 \times (M_1 + \frac{M_1}{k}) = 64$ blocks. To update one fragment $M_1$ of the file, the update bandwidth will be $\frac{M_1}{k} n = 20$ blocks.

Scheme B: Similar to Scheme A, the file is divided into 4 fragments of size $M_1 = 12$. In the case of 1-node failure, the repair bandwidth will be $4 \times \frac{M_1}{2} = 24$. To update one fragment $M_1$ of the file, the update bandwidth will be $\frac{M_1}{k} n = 20$.

Alex method [1]: Divide the file into 1 fragment of size $M_2 = 48$. The repair bandwidth of 1-node failure will be $1 \times \frac{k+1}{2k} M_2 = 32$, and 2-node failure requires $1 \times (M_2 + \frac{M_2}{k}) = 64$. The update bandwidth of one fragment $M_2$ will be $\frac{M_2}{k} n = 80$. The repair and update bandwidths of the other methods are computed in the same manner and shown in Table III. Note that the C.R.C method cannot repair 1-node failure with the optimal bandwidth.
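The arithmetic of this case study can be reproduced with a short script. It is a sketch under the assumptions used above: a 48-block file, a $(5, 3)$ code, a 2-node repair cost of $M_i + M_i/k$ per fragment, and an update cost of $(M_i/k)\,n$ for the single fragment touched. The function name `case_study` and its parameters are hypothetical.

```python
from fractions import Fraction

FILE_SIZE = 48       # blocks, as in the case study
N_NODES, K = 5, 3    # (n, k) of the storage code


def case_study(fragment_size, one_node_factor):
    """Return (1-node repair, 2-node repair, update) bandwidth in blocks.

    fragment_size   -- minimum fragment size M_i of the scheme
    one_node_factor -- 1-node repair bandwidth as a fraction of M_i
    """
    fragments = Fraction(FILE_SIZE, fragment_size)
    one_node = fragments * one_node_factor * fragment_size
    two_node = fragments * (fragment_size + Fraction(fragment_size, K))
    update = Fraction(fragment_size, K) * N_NODES    # only one fragment is touched
    return one_node, two_node, update


optimal = Fraction(K + 1, 2 * K)    # bound (2) expressed as a fraction of M

print(case_study(12, optimal))           # Schemes A and C
print(case_study(12, Fraction(1, 2)))    # Scheme B (2-node repair is N.A. in Table III)
print(case_study(48, optimal))           # method of [1]
```

With these inputs the script yields 32, 64 and 20 blocks for Schemes A and C, 24 blocks for the 1-node repair of Scheme B, and 32, 64 and 80 blocks for the method of [1], matching the figures quoted in the text.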

From Table III, it can be seen that Schemes A and C can achieve the optimal bandwidth for repairing 1- or 2-node failure, which is the same as homogeneous DSS. By sacrificing the MDS property, Scheme B requires a lower repair bandwidth for 1-node failure. It can be seen that all proposed schemes have the advantage of a small update bandwidth compared with the other schemes, except the CRC method. However, the CRC method is not practical, since it cannot achieve the optimal repair bandwidth in the case of 1-node failure. The methods of [5] and [14] are also impractical, since they can repair only the systematic nodes.

B. Data Availability 

In this subsection, we employ the framework proposed in [16] to measure and compare the data availability of our proposed non-homogeneous DSS schemes and traditional homogeneous DSS schemes, to show the efficiency of our proposed schemes. Let $[p_1, \dots, p_h]$ be the online probabilities of the $h$ nodes in the $(n, k, h)$ DSS. Let the power set of $h$, $2^h$, denote the set of all possible combinations of online nodes. Let $A \in 2^h$ represent one of these possible combinations. Then, we will use $Q_A$ to represent the event that combination $A$ occurs. Since node availabilities are independent, we have

$$\Pr[Q_A] = \prod_{i \in A} p_i \prod_{j \in 2^h \setminus A} (1 - p_j) \qquad (19)$$



Let $x_i$ be the number of data blocks stored in storage node $i$; for example, $x_i = 1$ means $\alpha_i = \frac{M}{k}$. The data allocation of our schemes will be $(x_1 = 2, x_2 = 1, \dots, x_{k+1} = 1, x_{k+2} = 0)$. Let $L_k \subseteq 2^h$ be the subset containing those combinations of available nodes which together store $k$ different redundant blocks:

$$L_k = \left\{ A : A \in 2^h,\ \sum_{i \in A} x_i \ge k \right\} \qquad (20)$$



Since the retrieval process needs to download $k$ different blocks out of the total $n$ redundant blocks, the probability of successful recovery for an allocation $(x_1, \dots, x_n)$ can be measured as

$$\Pr[\text{successful recovery}] = \sum_{A \in L_k} \Pr[Q_A] = \sum_{A \in L_k} \prod_{i \in A} p_i \prod_{j \in 2^h \setminus A} (1 - p_j) \qquad (21)$$
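Equations (19) to (21) can be evaluated by brute-force enumeration of the online-node combinations, which is feasible for the small node counts considered here. The following is a minimal sketch, not the authors' simulation code. The function name `availability` is hypothetical, and the probabilities ($p_1 = 0.75$ for node 1, $p = 0.6$ elsewhere, $k = 4$) are illustrative assumptions. The homogeneous comparison is assumed to use $k + 2$ single-block nodes, with node 1 also online with probability $p_1$.

```python
from itertools import combinations


def availability(blocks, probs, k):
    """Probability that the online nodes together hold at least k blocks.

    blocks[i] -- number of coded blocks x_i stored at node i
    probs[i]  -- online probability p_i of node i (assumed independent)
    """
    if len(blocks) != len(probs):
        raise ValueError("blocks and probs must have the same length")
    nodes = range(len(blocks))
    total = 0.0
    for size in range(len(blocks) + 1):
        for online in combinations(nodes, size):
            if sum(blocks[i] for i in online) < k:
                continue                       # this combination is not in L_k
            pr = 1.0
            for i in nodes:
                pr *= probs[i] if i in online else 1.0 - probs[i]
            total += pr                        # Pr[Q_A] from (19)
    return total


# Hypothetical probabilities: p_1 = 0.75 for node 1, p = 0.6 for the rest, k = 4.
k, p1, p = 4, 0.75, 0.6
non_homo = availability([2] + [1] * k, [p1] + [p] * k, k)       # k + 1 nodes
homo = availability([1] * (k + 2), [p1] + [p] * (k + 1), k)     # k + 2 nodes
print(round(non_homo, 5), round(homo, 5))
```

For these illustrative values the non-homogeneous allocation gives an availability of about 0.648, against about 0.596 for the homogeneous one.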



To compare the data availability, we examine a scenario in which the online probability of the super node is greater than that of the other nodes, $p_1 > p_2 = p_3 = \cdots = p_n = p$. The data availability of homogeneous DSS, $\Pr_{homo}$ (e.g. the scheme in [1]), and of non-homogeneous DSS, $\Pr_{non\text{-}homo}$ (e.g. Schemes A and C; since Scheme B is based on a non-MDS code, it is excluded from this study, as its availability is calculated in a different manner), can be computed by the following equations:

$$\Pr_{homo} = p^{k+1} + (k+1)(1-p)\,p^{k} + \frac{k(k+1)}{2}\, p_1 (1-p)^2 p^{k-1} \qquad (22)$$

$$\Pr_{non\text{-}homo} = p^{k} + k\, p_1 (1-p)\, p^{k-1} + \frac{k(k-1)}{2}\, p_1 (1-p)^2 p^{k-2} \qquad (23)$$

Let $p_1 = xp$ where $x > 1$. The condition $\Pr_{non\text{-}homo} > \Pr_{homo}$ induces $x > p / \left(p + \frac{1}{2}(1-p)\,[(k-1) - (k+1)p]\right)$. It can be seen that if $p \le \frac{k-1}{k+1}$, then $p / \left(p + \frac{1}{2}(1-p)\,[(k-1) - (k+1)p]\right) \le 1 < x$. Therefore, $\Pr_{non\text{-}homo} > \Pr_{homo}$ for all $p \le \frac{k-1}{k+1}$. We run the simulations for the case



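Table II. Comparison of the non-homogeneous model vs. the traditional model based on $(k + 2, k)$ codes, where $M$ and $\gamma$ are the file size and repair bandwidth, respectively. Small values of $M$, $\gamma$ and $q$ mean efficient.

                      | Scheme A&C | Scheme B | Alex [1] | Perm. code [5] | Tamo [14] | C.R.C [10]
File size, field size | $M = 2^{k-1} k$, $q \ge 2k - 1$ | $M = 2^{k-1} k$, $q \ge 2k - 1$ | $M = 2^{k+1} k$, $q \ge 2k + 3$ | $M = 2^{k} k$, $q \ge 2k + 1$ | $M = 2^{k} k$, $q \ge 2k + 1$ | $M = 2k$, $q \ge n$
1-node failure        | $\gamma = \frac{k+1}{2k} M$ | $\gamma = \frac{M}{2}$ | $\gamma = \frac{k+1}{2k} M$ | $\gamma = \frac{k+1}{2k} M$ | $\gamma = \frac{k+1}{2k} M$ | N.A
2-node failures       | $\gamma = M + \frac{M}{k}$ | N.A | $\gamma = M + \frac{M}{k}$ | $\gamma = M + \frac{M}{k}$ | $\gamma = M + \frac{M}{k}$ | $\gamma = M + \frac{M}{k}$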

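Table III. Numerical results of storing a file of size 48 blocks using the (5, 3) MDS codes in non-homogeneous and homogeneous DSS.

                          | Scheme A&C | Scheme B | Alex [1] | Perm. code [5] | Tamo [14] | C.R.C [10]
Fragment size             | $M_1 = 12$ | $M_1 = 12$ | $M_2 = 48$ | $M_3 = 24$ | $M_4 = 24$ | $M_5 = 6$
Field size                | $q \ge 5$ | $q \ge 5$ | $q \ge 9$ | $q \ge 7$ | $q \ge 7$ | $q \ge 5$
1-node failure            | $\gamma = 32$ | $\gamma = 24$ | $\gamma = 32$ | $\gamma = 32$ | $\gamma = 32$ | N.A
2-node failures           | $\gamma = 64$ | N.A | $\gamma = 64$ | $\gamma = 64$ | $\gamma = 64$ | $\gamma = 64$
Update (≤ 12 data blocks) | $\delta = 20$ | $\delta = 20$ | $\delta = 80$ | $\delta = 40$ | $\delta = 40$ | $\delta = 10$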

Figure 8. A comparison of data availability between non-homogeneous DSS and homogeneous DSS. (Curves: homogeneous; non-homogeneous (proposed). Horizontal axis: $p_1$.)

of $k = 4$, $p = 0.6$ and $p = 0.65$ and obtain the result in Fig. 8. It can be seen that for $p = \frac{k-1}{k+1} = 0.6$, the data availability of the non-homogeneous DSS scheme outperforms that of the homogeneous DSS scheme. For $p = 0.65 > \frac{k-1}{k+1}$, the non-homogeneous schemes also show a large improvement when $p_1$ has a high online availability. Therefore, it can be seen that our proposed non-homogeneous DSS schemes achieve a higher data availability than the traditional homogeneous DSS. The gap between the two becomes larger when the online availability of the super node increases; e.g., when $p_1$ is 25% greater than $p$, the data availability of the proposed non-homogeneous DSS is increased by 10% over the homogeneous DSS.

V. Conclusions 

We proposed three distributed storage schemes for non-homogeneous DSS with high-rate $(k + 2, k)$ codes. Two of the schemes make use of MDS codes, and can achieve the optimal repair bandwidth of $\frac{k+1}{2k} M$ with a smaller finite field size $q$ and a 75% smaller fragment size $M$ than [2]. Small $M$ and $q$ are desirable, because they reduce the update bandwidth and complexity. Another scheme, based on a non-MDS code, can achieve a repair bandwidth that is smaller than the optimal bandwidth based on MDS codes by $\frac{M}{2k}$ for 1-node failure. We further demonstrate that in such non-homogeneous DSS, if we can ensure one super node with a higher online probability than the other nodes, we can achieve a higher data availability than the homogeneous DSS.